First-Order Markov Decision Processes

Author

  • Matthew Greig
Abstract

Markov Decision Processes (MDPs) [7] have recently emerged as a standard method for representing uncertainty in decision-theoretic planning. Traditional MDP solution techniques have the drawback that they require an explicit state space, which limits their applicability to real-world problems because of the large number of world states such problems involve. Recent work addresses this drawback by compactly specifying the state space in factored form, as the set of possible assignments to a set of state-variables [1]. Such Factored MDPs (FMDPs) make it easy to represent exponentially large state spaces. The algorithms for planning with MDPs, however, still run in time polynomial in the size of the state space, i.e., exponential in the number of state-variables, and methods for exploiting the structure of the state space when planning with FMDPs have been proposed in the literature [1][3]. Many complex systems are naturally represented using some form of relational representation. We propose to investigate methods of incorporating relational representations into the state space of MDPs, so that they can be applied to complex problems for which less expressive representations may be impractical. The idea of using more expressive representations in MDPs is of course not new to the stochastic planning community: recent work by Boutilier et al. identifies limitations in the applicability of FMDPs to problems with complex state spaces and proposes First-Order MDPs (FOMDPs) based on the situation calculus [2]. This work still has some limitations: it needs a separate description for each possible outcome of an action, and it requires all intermediate value functions computed to be piecewise constant. We are currently investigating alternatives to this approach that avoid the first of these limitations by using Bayesian network (BN) techniques to represent action effects.
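The cost of explicit-state planning mentioned above can be seen in a minimal value-iteration sketch. The 3-state, 2-action model below is purely illustrative (not from the paper); every Bellman backup touches the full transition tables, so runtime scales with the size of the enumerated state space:

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
# Explicit tables: P[a][s][s'] = transition probability, R[s][a] = reward.
P = np.array([
    [[0.8, 0.2, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]],  # action 0
    [[0.1, 0.9, 0.0], [0.1, 0.0, 0.9], [0.0, 0.1, 0.9]],  # action 1
])
R = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])

V = np.zeros(n_states)
for _ in range(200):  # iterate Bellman backups until (near) convergence
    # Q[s, a] = R[s, a] + gamma * sum_{s'} P[a, s, s'] * V[s']
    Q = R + gamma * np.einsum('asp,p->sa', P, V)
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
```

With n boolean state-variables, these tables have 2^n rows per action, which is exactly the blow-up that factored representations avoid.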
In FMDPs the action effects are compactly represented using a Two-Stage Temporal Bayesian Network [4], or Dynamic Bayesian Network (DBN), by exploiting the fact that an action's effect on a state-variable often depends on only a small number of other state-variables. Recent work has developed Probabilistic Relational Models (PRMs) for representing probability distributions over relational domains, building on the ideas of BNs [5][6]. PRMs offer an alternative approach to representing action effects in relational MDPs. Our current interest lies in developing a method for representing action effects using a PRM, in much the same manner that DBNs are used in FMDPs. Of prime importance when developing this representation of the transition matrix is to ensure that the structure of the compact representation can also be exploited by solution techniques, so as to keep planning tractable.
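The DBN idea can be sketched concretely. In the toy factored transition below (variable names, parent sets, and probabilities are all illustrative, not taken from the paper), each next-state variable has a conditional probability table over a small parent set, so the model's size grows with the number of variables rather than the number of states:

```python
import random

# DBN-style factored transition for a single action over two boolean
# state-variables. cpt maps each next-state variable to its parent set
# and a table giving P(variable' = True | parent values).
cpt = {
    "painted": (("painted", "holding_brush"),
                {(True, True): 1.0, (True, False): 1.0,
                 (False, True): 0.9, (False, False): 0.0}),
    "holding_brush": (("holding_brush",),
                      {(True,): 0.95, (False,): 0.0}),
}

def step(state):
    """Sample a successor state variable-by-variable from the DBN."""
    nxt = {}
    for var, (parents, table) in cpt.items():
        p_true = table[tuple(state[p] for p in parents)]
        nxt[var] = random.random() < p_true
    return nxt

successor = step({"painted": False, "holding_brush": True})
```

Each row of `cpt` plays the role of one family in the two-stage network; a flat transition matrix over the same two variables would need 4 x 4 entries per action, and 2^n x 2^n in general.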


Similar articles

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...


LIFT-UP: Lifted First-Order Planning Under Uncertainty

We present a new approach for solving first-order Markov decision processes combining first-order state abstraction and heuristic search. In contrast to existing systems, which start by propositionalizing the decision process and then perform state abstraction on its propositionalized version, we apply state abstraction directly on the decision process, avoiding propositionalization. Secondly, ...


Symbolic Dynamic Programming within the Fluent Calculus

A symbolic dynamic programming approach for modelling first-order Markov decision processes within the fluent calculus is given. Based on an idea initially presented in [3], the major components of Markov decision processes such as the optimal value function and a policy are logically represented. The technique produces a set of first-order formulae with equality that minimally partitions the s...


Symbolic Dynamic Programming

A symbolic dynamic programming approach for solving first-order Markov decision processes within the situation calculus is presented. As an alternative specification language for dynamic worlds the fluent calculus is chosen and the fluent calculus formalization of the symbolic dynamic programming approach is provided. The major constructs of Markov decision processes such as the optimal value f...


Faster Dynamic Programming for Markov Decision Processes

Markov decision processes (MDPs) are a general framework used in artificial intelligence (AI) to model decision theoretic planning problems. Solving real world MDPs has been a major and challenging research topic in the AI literature, since classical dynamic programming algorithms converge slowly. We discuss two approaches in expediting dynamic programming. The first approach combines heuristic...


Learning first-order Markov models for control

First-order Markov models have been successfully applied to many problems, for example in modeling sequential data using Markov chains, and modeling control problems using the Markov decision processes (MDP) formalism. If a first-order Markov model’s parameters are estimated from data, the standard maximum likelihood estimator considers only the first-order (single-step) transitions. But for ma...




Publication date: 2002